What is Claimed: 

1. A computer-implemented method of ranking the relevancy of a collection 
of hypertext pages to a keyword-based query, comprising: 

calculating an intrinsic rank of a page; 

calculating an extrinsic rank of the page; and 

calculating the rank of the page by combining the intrinsic rank and the 
extrinsic rank. 

2. The method of claim 1, wherein the intrinsic rank is a function of the 
content score and the page weight of the page. 

3. The method of claim 2, wherein the content score is a function of the 
frequency, location, and/or font size of a keyword in the page. 
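
By way of illustration only, a content score of the kind recited in claim 3 might be sketched as follows. The claim does not specify how frequency, location, and font size are combined, so the location weights, the `Occurrence` record, and the multiplicative form below are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical location weights; the claim only says the score depends on
# frequency, location, and/or font size of the keyword.
LOCATION_WEIGHT = {"title": 3.0, "heading": 2.0, "body": 1.0}


@dataclass
class Occurrence:
    location: str     # "title", "heading", or "body" (assumed categories)
    font_size: float  # relative to the base font size; 1.0 means normal


def content_score(occurrences):
    """Sum one weighted term per keyword occurrence.

    Frequency enters through the number of terms; location and font size
    scale each term. Unknown locations are treated like body text.
    """
    return sum(
        LOCATION_WEIGHT.get(o.location, 1.0) * o.font_size for o in occurrences
    )
```

For example, a keyword that appears once in the title and twice in the body (one of those at double size) would score 3 + 1 + 2 = 6 under these assumed weights.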



4. The method of claim 2, wherein the page weight is defined as the 
probability of a user visiting the page when traveling in the collection of hypertext 
pages in a random fashion. 

5. The method of claim 2, wherein the page weight is obtained as the sum of 
the product of a link weight of each inbound link to the page and the page weight 
of the originating page. 

6. The method of claim 2, wherein the page weight is computed by the 
following steps of: 

constructing a connectivity graph, which represents the collection of 
hypertext pages and the link structure between the pages; 

adding a page weight reservoir with bi-directional links to and from each of 
the pages in the collection of hypertext pages; and 

summing all of the products of each inbound link weight with the page 
weight of the originating page providing the inbound link. 
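
The steps of claim 6 can be illustrated with a minimal sketch. It assumes the graph is held as a dictionary of outbound link lists, that link weights are uniform (as in claim 9), and that the reservoir is an extra node named `__reservoir__`; these representation choices and the function names are hypothetical.

```python
def add_reservoir(links, pages, reservoir="__reservoir__"):
    """Copy the link graph and connect a reservoir node to and from every page.

    links maps a page to the pages it links to. Links that point outside
    the collection are dropped (an assumption of this sketch).
    """
    page_set = set(pages)
    graph = {p: [t for t in links.get(p, []) if t in page_set] for p in pages}
    for p in pages:
        graph[p].append(reservoir)      # page -> reservoir
    graph[reservoir] = list(pages)      # reservoir -> every page
    return graph


def propagate(graph, weights):
    """One summation step: each node receives link_weight * weight(origin)
    over all of its inbound links, using uniform link weights."""
    out = {node: 0.0 for node in graph}
    for src, targets in graph.items():
        if not targets:
            continue
        link_weight = 1.0 / len(targets)
        for dst in targets:
            out[dst] += link_weight * weights[src]
    return out
```

Because every page links to the reservoir, no page is left without an outbound link, so the total weight is conserved from one step to the next in this sketch.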




7. The method of claim 2, further comprising computing the page weights by 
the following steps of: 

initializing a page weight vector to a constant; 

constructing a connectivity graph representative of the link structure of the 
collection of pages; 

computing an output page weight vector from the input page weight vector 
and the connectivity graph; and 

comparing the output page weight vector with the input page weight vector for 
convergence, and if convergence is reached, writing the output page weight 
vector in a page weight database, and if not, mixing the input and output page 
weight vectors to generate a new input page weight vector and repeating until 
convergence is reached. 
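
The iteration of claim 7 might look like the following sketch. The linear mixing rule, the mixing factor `mix`, the tolerance `tol`, the uniform link weights, and the list-of-lists graph are assumptions of this example; the claim itself does not fix them. The sketch also assumes a non-empty graph in which every page has at least one outbound link.

```python
def step(graph, w):
    """Compute the output weight vector from the input vector and the graph.

    graph[i] lists the indices of the pages that page i links to.
    """
    out = [0.0] * len(w)
    for src, targets in enumerate(graph):
        for dst in targets:
            out[dst] += w[src] / len(targets)   # uniform link weight
    return out


def page_weights(graph, mix=0.5, tol=1e-10, max_iter=10_000):
    """Iterate until the input and output vectors agree to within tol."""
    n = len(graph)
    w_in = [1.0 / n] * n                         # initialize to a constant
    for _ in range(max_iter):
        w_out = step(graph, w_in)
        if max(abs(a - b) for a, b in zip(w_in, w_out)) < tol:
            return w_out                         # converged; caller may store it
        # Not converged: mix input and output to form the next input vector.
        w_in = [mix * a + (1.0 - mix) * b for a, b in zip(w_in, w_out)]
    raise RuntimeError("page weights did not converge")
```

A caller would write the returned vector to whatever store plays the role of the page weight database.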

8. The method of claim 5, wherein the link weight is defined as the probability 
of a user randomly choosing the link to visit other pages when traveling in the 
collection of hypertext pages. 

9. The method of claim 5, wherein the link weight of the inbound links has a 
uniform value corresponding to the reciprocal of the total number of links 
outbound from an originating page. 

10. The method of claim 5, wherein the link weight has a variable value, which 
depends on the number of outbound links, the offset of the link, the size of the 
paragraph where the link is located, and/or whether the link is an external or 
internal link. 

11. The method of claim 1, wherein the extrinsic rank is a function of the 
anchor weight and the page weight of the pages providing inbound links to the 
page. 

12. The method of claim 1, wherein the extrinsic rank is obtained by summing 
the products of the anchor weight and the page weight of the originating page 
providing each inbound link. 

13. The method of claim 11, wherein the anchor weight is a function of the 
inbound link weights and the keyword being present in the anchor text, in the 
vicinity of the anchor text, or in text related to the topic of the anchor text. 
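
As a non-authoritative illustration of claims 12 and 13, the extrinsic rank can be sketched as a sum over inbound links. The particular anchor weight used here (the link weight when the keyword occurs as a word in the anchor text, zero otherwise) is an assumption; the claims also allow the keyword to appear near the anchor text or in related text, which this sketch ignores.

```python
def anchor_weight(link_weight, anchor_text, keyword):
    """Hypothetical anchor weight: the inbound link weight if the keyword
    is one of the words of the anchor text, otherwise zero."""
    words = anchor_text.lower().split()
    return link_weight if keyword.lower() in words else 0.0


def extrinsic_rank(inbound_links, page_weight, keyword):
    """Sum anchor_weight * page_weight(origin) over the inbound links.

    inbound_links is an iterable of (origin_page, link_weight, anchor_text);
    page_weight maps each originating page to its page weight.
    """
    return sum(
        anchor_weight(link_weight, text, keyword) * page_weight[origin]
        for origin, link_weight, text in inbound_links
    )
```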



14. The method of claim 11, wherein the page weight is defined as the 
probability of a user randomly visiting a page in the collection of hypertext pages. 

15. The method of claim 11, wherein the page weight is obtained by summing 
the products of the link weight of each inbound link to the page and the page 
weight of the originating page providing the inbound links. 



16. The method of claim 11, wherein the page weight is computed by the 
following steps of: 

constructing a connectivity graph, which represents the collection of 
hypertext pages and the link structure between the pages; 

adding a page weight reservoir with bi-directional links to and from each of 
the pages in the collection of hypertext pages; and 

summing all of the products of each inbound link weight with the page 
weight of the originating page providing the inbound link. 



17. The method of claim 11, further comprising computing the page weights 
by the following steps of: 

initializing a page weight vector to a constant; 

constructing a connectivity graph representative of the link structure of the 
collection of pages; 

computing an output page weight vector from the input page weight vector 
and the connectivity graph; and 

comparing the output page weight vector with the input page weight vector 
for convergence, and if convergence is reached, writing the output page weight 
vector in a page weight database, and if not, mixing the input and output page 
weight vectors to generate a new input page weight vector and repeating until 
convergence is reached. 

18. The method of claim 15, wherein the link weight is defined as the 
probability of a user randomly choosing the link to visit other pages when 
traveling in the collection of hypertext pages. 



19. The method of claim 15, wherein the link weight of the inbound links has a 
uniform value corresponding to the reciprocal of the total number of links 
outbound from an originating page. 



20. The method of claim 15, wherein the link weight has a variable value, 
which depends on the number of outbound links, the offset of the link, the size of 
the paragraph where the link is located, and/or whether the link is an external or 
internal link. 



21. The method of claim 1, wherein the collection of hypertext pages is 
fetched from the Web. 



22. A computer-implemented method of ranking a collection of hypertext 
pages, comprising: 

calculating the intrinsic rank of a page for a multi-keyword query; 

calculating the extrinsic rank of the page for the multi-keyword query; and 

calculating the rank of the page in the collection of hypertext pages by 
combining the intrinsic rank and the extrinsic rank. 



23. The method of claim 22, wherein the intrinsic rank is a function of content 
score and the page weight. 

24. The method of claim 23, wherein the content score is a function of the 
proximity value of the multi-keywords and of the frequency, location, and/or font 
size of the multi-keywords in the page. 

25. The method of claim 22, wherein the extrinsic rank of the page is a 
function of the partial extrinsic ranks and the proximity value of the 
multi-keywords. 

26. The method of claim 25, wherein partial extrinsic rank is a function of the 
anchor weight and the page weight of the pages with identical anchor text. 

27. The method of claim 25, wherein partial extrinsic rank is computed by 
summing the products of the anchor weight and the page weight of the pages 
with identical anchor text. 
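
The grouping described in claims 26 and 27 might be sketched as below, purely as an illustration. Treating anchor texts as identical after trimming whitespace and lowercasing is an assumption of this example, as is the tuple layout of `inbound_links`. How the partial ranks are then combined with proximity values (claim 25) is not specified in the claims and is not attempted here.

```python
from collections import defaultdict


def partial_extrinsic_ranks(inbound_links, page_weight):
    """Return one partial extrinsic rank per distinct anchor text.

    inbound_links is an iterable of (origin_page, anchor_weight, anchor_text).
    Each partial rank is the sum of anchor_weight * page_weight(origin)
    over the links that share the same (normalized) anchor text.
    """
    partial = defaultdict(float)
    for origin, anchor_weight, text in inbound_links:
        key = text.strip().lower()   # assumed notion of "identical" text
        partial[key] += anchor_weight * page_weight[origin]
    return dict(partial)
```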



28. A Web search engine, comprising: 

a Web page database; 

a crawler to fetch pages from the Web and store the pages in the Web 
page database; 

a link extractor to extract link information from the pages; 

a URL management system to assign an identification number to the URL 
of each page, and store identification number and URL pairs in the Web page 
database and send new URLs to the crawler to be retrieved from the Web; 

an anchor text and link database; 

an anchor text and link extractor to extract the anchor text and the link 
information from the pages and store in the anchor text and link database; 

an indexed database; 

an indexer to parse keywords from the pages and store the keyword and 
URL identification pairs in the indexed database; and 

a ranker to rank a page based on intrinsic rank and extrinsic rank of the 
page. 
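
As one possible reading of the URL management element of claim 28, the sketch below assigns sequential identification numbers and queues previously unseen URLs for the crawler. The class name, the in-memory dictionary standing in for the database, and the sequential numbering scheme are assumptions of this example.

```python
class UrlManager:
    """Assign identification numbers to URLs and queue new URLs for crawling."""

    def __init__(self):
        self.ids = {}        # URL -> identification number
        self.to_crawl = []   # new URLs waiting to be sent to the crawler

    def register(self, url):
        """Return the URL's identification number.

        On first sight the URL receives the next sequential number and is
        queued for retrieval; later calls return the stored number.
        """
        if url not in self.ids:
            self.ids[url] = len(self.ids)
            self.to_crawl.append(url)
        return self.ids[url]
```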



29. The Web search engine of claim 28, wherein the ranker determines the 
intrinsic rank from content information in the indexed database and the page 
weight computed from the link information in anchor text and link database, and 
the extrinsic rank from the anchor text information in the anchor text and link 
database and the computed page weight. 

30. The Web search engine of claim 28, wherein the ranker determines the 
intrinsic rank of the page based on the content score and the page weight. 



31. The Web search engine of claim 28, wherein the ranker determines the 
extrinsic rank of the page based on the anchor weight of each inbound link and 
the page weight of the originating page. 

32. The Web search engine of claim 28, wherein the ranker determines the 
anchor weight based on the link weight and the keyword being present in the 
anchor text or related text. 



33. The Web search engine of claim 28, wherein the ranker calculates the 
intrinsic rank and extrinsic rank of a page for a multi-keyword query, wherein the 
intrinsic rank is a function of content score and the page weight, the extrinsic 
rank of the page is a function of the partial extrinsic ranks and proximity values. 



34. The Web search engine of claim 28, further comprising a page weight 
generator and a page weight database, computing page weights by initializing a 
page weight vector to a constant, constructing a connectivity graph representing 
the link structure of the fetched pages, computing an output page weight vector 
from the input page weight vector and the connectivity graph, and comparing the 
output page weight vector with the input page weight vector and if convergence 
is reached, writing the output page weight vector in a page weight database, and 
if not, mixing the input and output page weight vectors to generate a new input 
page weight vector and repeating until convergence is reached. 



35. A computer system for ranking search results from a query on a collection 
of hypertext pages, comprising: 

a crawler to fetch pages from the collection of hypertext pages; 

a link extractor to extract page locator information from the fetched pages; 

a page locator management system for storing and retrieving the page locator 
information; 

a page database to store the pages; 

an indexer to parse keywords from the pages and store the keyword page 
locator pairs in the indexed database; 

an anchor text and link extractor to extract the anchor text and link structures 
from the pages; 

an anchor text and link database, wherein the anchor text and link extractor 
writes the anchor text and link structures into the anchor text and link database; 
and 

a ranker to assign a rank value to a page based on intrinsic and extrinsic 
rank. 



36. The system of claim 35, wherein the ranker assigns an intrinsic rank to the 
page based on a combination of content score and page weight. 



37. The system of claim 35, wherein the ranker assigns the content score to 
the page for a keyword based on a combination of location, frequency, and/or 
font size of the keyword in the page. 

38. The system of claim 35, wherein the ranker assigns a page weight to the 
page as the probability of a searcher visiting the page when traveling in the 
collection of hypertext pages in a random fashion. 

39. The system of claim 35, wherein the ranker assigns a uniform value 
corresponding to the reciprocal of the total number of links outbound from an 
originating page to link weight. 

40. The system of claim 35, wherein the ranker assigns link weight based on 
location of the link. 

41. The system of claim 35, wherein the ranker assigns an extrinsic rank to 
the page for a given keyword as a combination of anchor weight of the links from 
other pages and the page weight of referring pages. 

42. The system of claim 35, wherein the ranker assigns a rank value to a page 
for a multi-keyword query as a combination of intrinsic rank and extrinsic rank for 
the multi-keyword. 



43. The system of claim 35, wherein the ranker assigns an intrinsic rank to a 
page for a multi-keyword query as a combination of content score and page 
weight. 



44. The system of claim 35, wherein the ranker assigns a content score to a 
page for a multi-keyword query as a combination of content score based on 
intersection of the given keywords and proximity value. 



45. The system of claim 35, wherein the ranker assigns a partial extrinsic rank 
for each variation of identical anchor text. 



46. The system of claim 35, wherein the ranker assigns an extrinsic rank to a 
page for a multi-keyword query as a combination of partial extrinsic rank of 
identical anchor text and proximity values in each anchor text. 

47. The system of claim 35, wherein the ranker obtains a link connectivity 
graph of the pages. 

47. The system of claim 35, wherein the ranker obtains the rank values from 
the link connectivity graph. 

48. The system of claim 35, wherein the ranker calculates the page weight by 
iterative numerical procedure. 

49. The system of claim 35, wherein the ranker accelerates the convergence 
of the iterative numerical procedure in obtaining connectivity rank scores. 

50. The system of claim 35, wherein the ranker calculates rank values by 
dividing the pages into distinct number of groups. 

51. The system of claim 35, further comprising a rate controller to control the 
rate of request for page retrieval. 
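
A rate controller of the kind recited in claim 51 could, for instance, enforce a minimum interval between successive page requests. The claim does not state a policy, so the fixed-interval rule, the class name, and the injectable `clock` and `sleep` parameters (included so the behavior can be exercised without real delays) are assumptions of this sketch.

```python
import time


class RateController:
    """Enforce a minimum interval, in seconds, between page requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None   # earliest time the next request may start

    def wait(self):
        """Block until a request is allowed, then reserve the following slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
```

A crawler loop would call `wait()` immediately before each fetch.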



52. The system of claim 35, wherein the Web page database stores the pages 
in a fixed record large enough for a predetermined percentage of all of the 
pages, wherein if the page is smaller, the fixed record has some empty space, 
and if the page is larger, the Web page database stores as much of the page as 
possible in the fixed record and the rest in a record file. 
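
The storage scheme of claim 52 can be illustrated with a small sketch. The record size of 16 bytes, the NUL padding, the use of file-like objects for the fixed-record store and the record file, and the returned locator tuple are all assumptions chosen to keep the example short; a real record size would be chosen so that the predetermined percentage of pages fits.

```python
import io

RECORD_SIZE = 16  # hypothetical size, kept tiny for illustration


def store_page(fixed, overflow, page):
    """Store page (bytes) in one fixed record, spilling any excess to overflow.

    Returns (record_offset, overflow_offset, page_length).
    """
    head, tail = page[:RECORD_SIZE], page[RECORD_SIZE:]
    record_offset = fixed.seek(0, io.SEEK_END)
    fixed.write(head.ljust(RECORD_SIZE, b"\0"))   # short pages leave empty space
    overflow_offset = overflow.seek(0, io.SEEK_END)
    overflow.write(tail)                          # empty when the page fits
    return record_offset, overflow_offset, len(page)


def load_page(fixed, overflow, record_offset, overflow_offset, length):
    """Reassemble a page from its fixed record and, if needed, the overflow."""
    fixed.seek(record_offset)
    head = fixed.read(min(length, RECORD_SIZE))
    overflow.seek(overflow_offset)
    tail = overflow.read(max(0, length - RECORD_SIZE))
    return head + tail
```

With fixed-size records, the position of the n-th record is simply n times the record size, which is presumably the attraction of the scheme; only oversized pages need a second read from the record file.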